Covid-19 vs. FAANG Stock Prices -- Analysis and Prediction

CMSC320 Final Project

Bingying Jiang

Introduction

It has been almost two years since the outbreak of the COVID-19 pandemic. At its root, COVID-19 is a global health crisis that has affected and changed our lives; it is not a financial or economic crisis. However, because of its huge effects on supply and demand conditions, it was inevitable that it would turn into a large-scale economic crisis. In this project, I explore the correlation between COVID-19 cases and deaths and FAANG stock prices between January 2020 and December 2021, using machine learning models. Although many factors affect the stock market's volatility, the effect of this health crisis plays a vital role, with the market acting as a barometer for the path of the pandemic. In this article, Elaine Loh concluded that airline stock prices were more vulnerable to pandemic crises, since people tend to reduce travel during such periods. On the face of it, technology companies should not be affected much, because they have no direct connection to the pandemic. In this project, I think it would be fun to find out whether there is a correlation between daily COVID-19 cases and deaths and FAANG stock prices.

Imports

Plotly is the main visualization tool in this project. It can produce professional, interactive figures in just a few lines of code, and I personally think it works better than seaborn because it allows more flexibility and customization. You can find more ways to create fun figures on the Plotly website. Besides pandas and numpy, the keras library is used for the machine learning work. You probably noticed that I also installed tensorflow, which takes a while to download if you're running on a virtual machine. It is a powerful tool in the machine learning field, but it is not used directly in this project; installing it is just a way to fix a keras installation problem.

Data Collection

The first step in data analysis is to choose and obtain a dataset. In this case, I need up-to-date datasets for both COVID-19 and FAANG stock prices. I obtained daily-updated worldwide COVID-19 data from Our World in Data's GitHub site. Since it is a huge dataset covering the whole world, I have to clean it before analysis. Nasdaq's website has nearly perfect data for the stock market. This dataset includes the open, close, high, low, and volume of FAANG stocks over the past five years. Unfortunately, it does not include the adjusted close price. The adjusted closing price is the closing price after accounting for corporate actions such as dividends. Compared with the close price in this dataset, the adjusted closing price tends to give a better idea of the overall value of the stock.

Data Cleaning and Processing

The COVID-19 dataset is huge and contains case information for the whole world. In this article, I only intend to analyze the impact of US COVID-19 cases and deaths on US stocks. I picked the period from January 2020 to December 2021 and applied it to all datasets to ensure consistent date ranges and formats. I also calculated the standard deviation of the chosen columns in preparation for the analysis.
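The filtering step described above can be sketched as follows. This is a minimal illustration on a tiny made-up table, not the notebook's original cell; the column names (`location`, `date`, `new_cases`, `new_deaths`) are assumed to match the Our World in Data file.

```python
import pandas as pd

# Tiny stand-in for the worldwide table; column names and values are assumptions.
covid = pd.DataFrame({
    "location": ["United States", "United States", "Canada", "United States"],
    "date": ["2019-12-31", "2020-01-22", "2020-01-22", "2021-12-16"],
    "new_cases": [0.0, 1.0, 0.0, 150000.0],
    "new_deaths": [None, 0.0, 0.0, 1300.0],
})

covid["date"] = pd.to_datetime(covid["date"])

# Keep only US rows that fall inside the study window.
in_window = covid["date"].between("2020-01-01", "2021-12-31")
us_covid = covid[(covid["location"] == "United States") & in_window]

# Standard deviation of the chosen columns (NaN values are skipped by default).
column_std = us_covid[["new_cases", "new_deaths"]].std()
```

Applying the same date window to the stock tables keeps every dataset aligned on the same range of days.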

After checking the COVID-19 dataframe, I noticed some NaNs that needed to be removed or replaced. Replacing them with zero is the better option: the data is then ready for the calculations in the next few steps, and no rows are lost unnecessarily. The messy index also needed to be fixed. Meanwhile, because dates can't be used directly in calculations, I added a new column called day that counts the number of days. Finally, I personally feel it isn't proper to store case counts as floats, so I converted them to integers.
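A rough sketch of these cleaning steps, assuming a small dataframe with hypothetical column names `date`, `new_cases`, and `new_deaths`:

```python
import pandas as pd

# Stand-in data; the odd index mimics rows left over from the worldwide table.
us_covid = pd.DataFrame(
    {
        "date": pd.to_datetime(["2020-01-22", "2020-01-23", "2020-01-24"]),
        "new_cases": [1.0, None, 3.0],
        "new_deaths": [None, 0.0, 1.0],
    },
    index=[101, 205, 317],
)

# Replace NaNs with zero and rebuild a clean 0..n-1 index.
clean = us_covid.fillna(0).reset_index(drop=True)

# Numeric day counter, starting at 1 on the first date.
clean["day"] = (clean["date"] - clean["date"].min()).dt.days + 1

# Counts are whole numbers, so store them as integers.
clean = clean.astype({"new_cases": int, "new_deaths": int})
```

The cast to integer has to come after `fillna`, because a column that still contains NaN cannot be converted to a plain integer type.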

The dataframe above is basically clean and tidy. Now it's time to standardize the daily new cases and new deaths by subtracting the column mean and dividing by the column standard deviation, which computes standardized values for all columns at the same time. To standardize a dataset means to scale all of its values so that the mean is 0 and the standard deviation is 1. More information about standardizing can be found here.
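As a small illustration of this standardization, assuming made-up values and column names, pandas can standardize every column in one expression:

```python
import pandas as pd

# Made-up daily counts; only the arithmetic matters here.
daily = pd.DataFrame({"new_cases": [10, 20, 30, 40], "new_deaths": [1, 3, 5, 7]})

# Subtract each column's mean and divide by its standard deviation.
# ddof=0 uses the population standard deviation, so the result has std exactly 1;
# pandas' default (ddof=1) gives values that differ by a constant factor.
standardized = (daily - daily.mean()) / daily.std(ddof=0)
```

Because `daily.mean()` and `daily.std()` return one value per column, the subtraction and division are applied column by column automatically.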

Next, FAANG stock price data cleaning and processing

FAANG is an acronym used to describe some of the most prominent companies in the tech sector. Originally the acronym was FANG, for Facebook (NASDAQ: FB), Amazon (NASDAQ: AMZN), Netflix (NASDAQ: NFLX), and Alphabet (NASDAQ: GOOG) (NASDAQ: GOOGL) (formerly Google). Apple (NASDAQ: AAPL) was added later, giving FAANG.

Since FAANG stands for five tech companies, I made a list of them to reduce the number of code lines. Using pandas, I removed the dollar sign and converted the close, open, high, and low columns to float, because the str type doesn't support the daily abnormal stock price calculation. I define the daily abnormal FAANG stock price between January 22, 2020 and December 16, 2021 by subtracting the average price over the last twenty-three months from the daily price and dividing the difference by the standard deviation over the last twenty-three months.
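The price cleaning and the abnormal price calculation might look roughly like this. The table is a stand-in for one Nasdaq export, and the column names (`Close/Last`, `Open`) and the `$` formatting are assumptions.

```python
import pandas as pd

# Stand-in for one company's export; prices arrive as strings such as "$100.00".
raw = pd.DataFrame({
    "Date": ["01/22/2020", "01/23/2020", "01/24/2020"],
    "Close/Last": ["$100.00", "$110.00", "$120.00"],
    "Open": ["$99.50", "$101.00", "$111.25"],
})

stock = raw.copy()
for col in ["Close/Last", "Open"]:
    # Strip the literal dollar sign, then convert the text to float.
    stock[col] = stock[col].str.replace("$", "", regex=False).astype(float)

# Abnormal price: distance from the period mean, in units of the period std.
close = stock["Close/Last"]
stock["abnormal_close"] = (close - close.mean()) / close.std()
```

Passing `regex=False` matters here, since `$` would otherwise be read as the end-of-string anchor of a regular expression. Looping the same steps over a list of the five tickers would cover every company.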

Exploratory Analysis and Data Visualization

Comparison between total cases and total deaths.

From the plot above, we can tell that the growth is almost exponential, that there is a positive relationship between total cases and total deaths over time, and that there is no sign of the curves flattening.

Relationship between daily deaths and daily cases

The plot above shows the distributions of both daily deaths and daily cases. They are almost identical except for the period between May 2020 and June 2020, when daily deaths reached a peak but daily cases did not. As a result, the distribution of daily deaths is multimodal and the distribution of daily cases is bimodal; both are symmetric around the center.

FAANG Stock Price

In this section I have made five different graphs that represent different attributes of each FAANG company. These attributes are: opening and closing prices, volumes, and the 14-, 21-, and 100-day moving averages. All of these are important to investors, who can use them to decide whether to buy or sell a stock. We decided to calculate the different moving averages because many buyers and sellers base their actions on them.
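The moving averages mentioned above can be computed with a pandas rolling window. The sketch below uses a synthetic close series, and the `ma_14`, `ma_21`, and `ma_100` column names are hypothetical.

```python
import pandas as pd

# Synthetic close prices: 1.0, 2.0, ..., 120.0.
close = pd.Series(range(1, 121), dtype=float, name="close")

# One column per window length; each value averages the current and preceding days.
moving_averages = pd.DataFrame(
    {f"ma_{window}": close.rolling(window=window).mean() for window in (14, 21, 100)}
)
```

The first `window - 1` rows of each column are NaN, because a full window of prices is not yet available on those days.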

We have added another feature to the graph that shows the standardized volumes in the background of the primary scatterplot. The volumes have been scaled in order to help users see the volumes better. We have also added an option in the dropdown menu where users can choose to see the standardized volume histogram in much more detail.

As you can see, the graphs in this section are all interactive. We have made a separate plot for each attribute, and users can select whichever graph they want to study.

In this section we have made five different graphs that show the close price of every FAANG company. We can tell from the plot that both Amazon and Google have much higher close prices than the other three. Amazon has the highest close price, and Apple has the lowest.

Abnormalities of FAANG Stock Price

The plot shows that the FAANG companies have similar abnormalities, which implies that their stock prices are affected in the same way.

Volatility is defined as how much variation there is in the price of a given stock or index of stocks; simply put, how widely a price can swing up or down. It is generally considered a measure of the level of risk in an investment. Typically, low volatility is associated with positive market returns and high volatility with negative market returns. However, volatility can be high whether stocks are increasing or decreasing in value. Average volatility is often within a range of 10-20%.

The FAANG companies have fairly high volatility: the highest is Netflix at 38%, and the lowest is Google at about 27%. Even the lowest is much higher than the normal range. So we can expect that FAANG stock prices might fall in the future, and it is risky to invest at this moment.
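The text does not state how these volatility figures were computed. One common convention, shown here purely as an assumption, is the standard deviation of daily returns scaled by the square root of 252 trading days per year. The prices below are made up.

```python
import numpy as np
import pandas as pd

# Made-up close prices for one stock.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 103.0, 106.0])

# Daily percentage returns; the first value is NaN and is dropped.
daily_returns = close.pct_change().dropna()

# Assumption: 252 trading days per year, a widely used convention.
TRADING_DAYS = 252
annualized_volatility = daily_returns.std() * np.sqrt(TRADING_DAYS)
```

Multiplying by 100 turns the result into a percentage that can be compared with the 10-20% range quoted above.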

Data Cleaning for Correlation Plot

Correlation between deaths and close price

Hypothesis Testing:

Hypothesis Testing to Check for a Relationship Between Daily Deaths and Close Price for FAANG: a hypothesis test is conducted to see whether there is a relationship between daily deaths and the Close Price for FAANG at a 95% confidence level.

Null Hypothesis: There is no relationship between daily deaths and Close Price for FAANG.

Alternative Hypothesis: There is a relationship between daily deaths and Close Price for FAANG.

If the p-value is greater than 0.05, we fail to reject the null hypothesis. If the p-value is smaller than 0.05, we reject the null hypothesis and accept the alternative hypothesis.

The p-value we get is 0.000, which is smaller than the 0.05 significance level, so we reject the null hypothesis and accept the alternative hypothesis.

As the null hypothesis is rejected, we can conclude that there is a relationship between daily death and Close Price of FAANG.
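The decision rule above can be illustrated with a Pearson correlation test from SciPy. The text does not say which test produced the reported p-value, so the choice of `scipy.stats.pearsonr` is an assumption, and the data below are synthetic rather than the project's real series.

```python
import numpy as np
from scipy import stats

# Synthetic series with a built-in linear relationship plus noise.
rng = np.random.default_rng(0)
daily_deaths = rng.uniform(0, 3000, size=200)
close_price = 150 + 0.02 * daily_deaths + rng.normal(0, 5, size=200)

# Pearson correlation coefficient and its two-sided p-value.
r, p_value = stats.pearsonr(daily_deaths, close_price)

# Reject the null hypothesis when the p-value is below the significance level.
ALPHA = 0.05
reject_null = p_value < ALPHA
```

A p-value printed as 0.000 is one that rounds to zero at three decimal places, so it falls below 0.05 and the null hypothesis is rejected.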

Predicting Close Price

LSTM Machine Learning

We can tell from the plot that the LSTM predicts the stock price well. An LSTM module has a cell state and gates, which allow it to model both long-term and short-term patterns in the data.
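Before a price series can be fed to an LSTM, it is usually cut into fixed-length windows, since Keras LSTM layers expect input shaped as (samples, timesteps, features). The helper below is a hypothetical sketch of that preparation step using only numpy; the function name `make_windows` and the `lookback` length are assumptions, not taken from the notebook.

```python
import numpy as np

def make_windows(series, lookback):
    """Split a 1-D series into (samples, lookback, 1) inputs and next-step targets."""
    values = np.asarray(series, dtype=float)
    # Each input holds `lookback` consecutive values; the target is the value after it.
    X = np.array([values[i:i + lookback] for i in range(len(values) - lookback)])
    y = values[lookback:]
    return X.reshape(-1, lookback, 1), y

# Toy "prices" 0..9 with a lookback of 3 days.
prices = np.arange(10, dtype=float)
X, y = make_windows(prices, lookback=3)
```

Here the first window holds the prices for days 0 to 2 and its target is the price on day 3, so the model always predicts one step ahead of the window it sees.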

KNN & Linear Regression & Decision Tree

From this graph, I infer that the k-NN regressor gives one of the most accurate predictions, judging by the accuracy scores. There are a couple of outliers in the graphs that could possibly change the predictions. However, these outliers are very insignificant and can always change based on the trading model that companies decide to adopt.
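A comparison of the three regressors could be set up along these lines with scikit-learn. The features and target are synthetic, and the hyperparameters (`n_neighbors=5`, `max_depth=5`) are illustrative assumptions rather than the notebook's settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic features (think: standardized cases and deaths) and a noisy linear target.
rng = np.random.default_rng(42)
X = rng.uniform(-2, 2, size=(300, 2))
y = 100 + 8 * X[:, 0] - 5 * X[:, 1] + rng.normal(0, 1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "k-NN": KNeighborsRegressor(n_neighbors=5),
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(max_depth=5, random_state=0),
}

# For regressors, .score() returns R^2 on the held-out data.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
```

Scoring every model on the same held-out split keeps the comparison fair, and the `scores` dictionary can be sorted to rank the models.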

Conclusion:

The aim of this project is to evaluate whether COVID-19 cases and deaths explain and predict the FAANG stock market during the COVID-19 period. I find that both COVID-19 cases and COVID-19 deaths have contemporaneous relationships with, and predictive ability for, abnormal stock prices. These shocks affect investment decisions and the subsequent stock price dynamics.